Enable architecture selection for DPCTL_TARGET_CUDA #2096

Merged
ndgrigorian merged 17 commits into master from update_cuda_build
Jun 19, 2025

Conversation

vlad-perevezentsev
Collaborator

This PR changes the DPCTL_TARGET_CUDA CMake option from a boolean to a string, allowing users to specify a CUDA architecture (e.g. sm_80). If the option is enabled without an explicit architecture (e.g. -DDPCTL_TARGET_CUDA=ON), it defaults to sm_50.

$ python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=<cuda_arch>"
# or
$ python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON"

The specified architecture is used to construct a SYCL alias target (e.g. nvidia_gpu_sm_80), which is passed to the compiler via the -fsycl-targets option, as described for oneAPI for NVIDIA GPUs.
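
To illustrate the mapping described above, here is a minimal Python sketch of how an option value could be turned into the SYCL alias target. It is not the PR's CMake code: the helper name `resolve_cuda_sycl_target`, the accepted boolean spellings, and the decision to reject unrecognized values are all assumptions made for this example. Only the `sm_50` default and the `nvidia_gpu_<arch>` alias form come from the description.

```python
# Hypothetical illustration of the behaviour described in this PR.
# The real logic lives in CMake; names and accepted spellings here are assumptions.

DEFAULT_CUDA_ARCH = "sm_50"                      # default stated in the PR description
_TRUTHY = {"ON", "TRUE", "YES", "Y", "1"}        # assumed boolean "on" spellings
_FALSY = {"", "OFF", "FALSE", "NO", "N", "0"}    # assumed boolean "off" spellings


def resolve_cuda_sycl_target(option_value):
    """Map a DPCTL_TARGET_CUDA value to a SYCL alias target, or None if disabled."""
    value = option_value.strip()
    if value.upper() in _FALSY:
        return None                              # CUDA target not requested
    if value.upper() in _TRUTHY:
        arch = DEFAULT_CUDA_ARCH                 # enabled without an explicit architecture
    elif value.startswith("sm_"):
        arch = value                             # explicit architecture, e.g. sm_80
    else:
        # Assumption: anything else is treated as a configuration error.
        raise ValueError(f"Unrecognized DPCTL_TARGET_CUDA value: {option_value!r}")
    return f"nvidia_gpu_{arch}"


if __name__ == "__main__":
    for raw in ("sm_80", "ON", "OFF"):
        print(raw, "->", resolve_cuda_sycl_target(raw))
```

Under these assumptions, `sm_80` resolves to `nvidia_gpu_sm_80` and `ON` resolves to `nvidia_gpu_sm_50`, which is the value that would then be handed to `-fsycl-targets`.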

Additionally, this PR removes the handling logic for the DPCTL_TARGET_CUDA environment variable.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?
  • If this PR is a work in progress, are you opening the PR as a draft?

github-actions bot commented Jun 5, 2025

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

github-actions bot commented Jun 5, 2025

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_8 ran successfully.
Passed: 1115
Failed: 6
Skipped: 119

@coveralls
Collaborator

coveralls commented Jun 5, 2025

Coverage Status

coverage: 84.972% (-0.02%) from 84.989%
when pulling 5bf20fd on update_cuda_build
into 35a8c26 on master.

github-actions bot commented Jun 5, 2025

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_9 ran successfully.
Passed: 1114
Failed: 7
Skipped: 119

github-actions bot commented Jun 5, 2025

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_10 ran successfully.
Passed: 1114
Failed: 7
Skipped: 119

github-actions bot commented Jun 6, 2025

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_17 ran successfully.
Passed: 1113
Failed: 8
Skipped: 119

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_22 ran successfully.
Passed: 1115
Failed: 6
Skipped: 119

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_30 ran successfully.
Passed: 1115
Failed: 6
Skipped: 119

Collaborator

@ndgrigorian ndgrigorian left a comment

LGTM

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_34 ran successfully.
Passed: 1115
Failed: 6
Skipped: 119

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_37 ran successfully.
Passed: 1114
Failed: 7
Skipped: 119

@ndgrigorian ndgrigorian merged commit 775d0d3 into master Jun 19, 2025
97 of 116 checks passed
@ndgrigorian ndgrigorian deleted the update_cuda_build branch June 19, 2025 21:32
Development

Successfully merging this pull request may close these issues.

Add CUDA architecture to CMake option when building for NVidia devices
4 participants